Discriminative clustering in Fisher metrics

Authors

  • Jarkko Salojärvi
  • Samuel Kaski
  • Janne Sinkkonen
Abstract

Discriminative clustering (DC) finds a Voronoi partitioning of a primary data space that, while consisting of local partitions, simultaneously maximizes information about auxiliary data categories. DC is useful in exploration and in finding coarser or more refined versions of already existing categories. Theoretical results suggest that Voronoi partitions in the so-called Fisher metric would outperform partitions in the Euclidean metric. Here we use a local quadratic approximation of the Fisher metric, derived from a conditional density estimator, to define the partitions, and show that the resulting algorithms outperform the conventional variants.
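As a rough illustration of what a Voronoi assignment under a local quadratic approximation of the Fisher metric might look like, the sketch below is not the authors' algorithm. It rests on three assumptions: the conditional density estimator is a softmax-linear model p(c|x) with hypothetical parameters `W` and `b`, the local metric is the Fisher information matrix J(x) = sum_c p(c|x) g_c g_c^T with g_c the gradient of log p(c|x) with respect to x, and the metric is evaluated at the data point rather than at the prototype. The function names `fisher_matrix` and `assign` are likewise made up for this example.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fisher_matrix(x, W, b):
    """Local Fisher information J(x) for an assumed softmax-linear p(c|x).

    W has shape (C, D), b has shape (C,), x has shape (D,).
    """
    p = softmax(W @ x + b)            # class posteriors, shape (C,)
    # For this model, grad_x log p(c|x) = W[c] - sum_k p_k W[k].
    G = W - p @ W                     # gradients, shape (C, D)
    return (G * p[:, None]).T @ G     # sum_c p_c g_c g_c^T, shape (D, D)

def assign(X, M, W, b):
    """Assign each row of X to the nearest prototype in M under the
    local quadratic distance d^2(x, m) = (m - x)^T J(x) (m - x)."""
    labels = np.empty(len(X), dtype=int)
    for i, x in enumerate(X):
        J = fisher_matrix(x, W, b)
        diff = M - x                  # shape (K, D)
        d2 = np.einsum("kd,de,ke->k", diff, J, diff)
        labels[i] = int(np.argmin(d2))
    return labels
```

In this sketch, directions of the primary space along which p(c|x) does not change contribute nothing to the distance, so the resulting partition can differ from the Euclidean one even with identical prototypes.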

Similar resources

Discriminative Clustering: Optimal Contingency Tables by Learning Metrics

The learning metrics principle describes a way to derive metrics to the data space from paired data. Variation of the primary data is assumed relevant only to the extent it causes changes in the auxiliary data. Discriminative clustering finds clusters of primary data that are homogeneous in the auxiliary data. In this paper, discriminative clustering using a mutual information criterion is show...

Learning metrics and discriminative clustering

In this work methods have been developed to extract relevant information from large, multivariate data sets in a flexible, nonlinear way. The techniques are applicable especially at the initial, explorative phase of data analysis, in cases where an explicit indicator of relevance is available as part of the data set. The unsupervised learning methods, popular in data exploration, often rely on ...

Clustering in Fisher Discriminative Subspaces

Clustering in high-dimensional spaces is nowadays a recurrent problem in many scientific domains but remains a difficult problem. This is mainly due to the fact that high-dimensional data usually live in low-dimensional subspaces hidden in the original space. This paper presents a model-based clustering approach which models the data in a discriminative subspace with an intrinsic dimension lowe...

Simultaneous model-based clustering and visualization in the Fisher discriminative subspace

Clustering in high-dimensional spaces is nowadays a recurrent problem in many scientific domains but remains a difficult task from both the clustering accuracy and the result understanding points of view. This paper presents a discriminative latent mixture (DLM) model which fits the data in a latent orthonormal discriminative subspace with an intrinsic dimension lower than the dimension of the ...

Joint Clustering and Feature Selection

Due to the absence of class labels, unsupervised feature selection is much more difficult than supervised feature selection. Traditional unsupervised feature selection algorithms usually select features to preserve the structure of the data set. Inspired from the recent developments on discriminative clustering, we propose in this paper a novel unsupervised feature selection approach via Joint ...

Publication date: 2003